Refining Aggregate Conditions in Relational Learning
Abstract
In relational learning, predictions for an individual are based not only on its own properties but also on the properties of a set of related individuals. Many systems use aggregates to summarize this set. Features thus introduced compare the result of an aggregate function to a threshold. We consider the case where the set to be aggregated is generated by a complex query and present a framework for refining such complex aggregate conditions along three dimensions: the aggregate function, the query used to generate the set, and the threshold value. The proposed aggregate refinement operator allows a more efficient search through the hypothesis space and thus can be beneficial for many relational learners that use aggregates. As an example application, we have implemented the refinement operator in a relational decision tree induction system. Experimental results show a significant efficiency gain in comparison with the use of a less advanced refinement operator.
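The three refinement dimensions named in the abstract can be illustrated with a small sketch. This is not the paper's operator: the class `AggregateCondition`, the helpers `refine_function`, `refine_query`, and `refine_threshold`, and the restriction to conditions of the form "aggregate >= threshold" with conjunctive `attr >= bound` selections are all assumptions made for illustration. The ordering max, avg, min relies only on the fact that max >= avg >= min on any non-empty set, so for a fixed set and threshold each step yields a condition that is at least as strict.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]

# Aggregate functions considered in this sketch (hypothetical selection).
AGGREGATES: Dict[str, Callable[[Sequence[float]], float]] = {
    "max": max,
    "avg": lambda xs: sum(xs) / len(xs),
    "min": min,
}

# For ">= threshold" tests on a fixed non-empty set: max >= avg >= min,
# so moving right in this list can only make the condition stricter.
FUNCTION_ORDER = ["max", "avg", "min"]


@dataclass(frozen=True)
class AggregateCondition:
    """Condition of the form: func(attr over selected rows) >= threshold."""
    func: str
    attr: str
    filters: Tuple[Tuple[str, float], ...]  # conjunctive selection: row[a] >= v
    threshold: float

    def holds(self, rows: List[Row]) -> bool:
        selected = [r for r in rows if all(r[a] >= v for a, v in self.filters)]
        if not selected:
            return False  # assumption: an empty set never satisfies the condition
        value = AGGREGATES[self.func]([r[self.attr] for r in selected])
        return value >= self.threshold


def refine_function(cond: AggregateCondition) -> List[AggregateCondition]:
    """Dimension 1: replace the aggregate function by the next one in the order."""
    i = FUNCTION_ORDER.index(cond.func)
    if i + 1 < len(FUNCTION_ORDER):
        return [replace(cond, func=FUNCTION_ORDER[i + 1])]
    return []


def refine_query(cond: AggregateCondition, attr: str, bound: float) -> AggregateCondition:
    """Dimension 2: add a selection condition to the query that generates the set."""
    return replace(cond, filters=cond.filters + ((attr, bound),))


def refine_threshold(cond: AggregateCondition, new_threshold: float) -> AggregateCondition:
    """Dimension 3: raise the threshold the aggregate result is compared to."""
    if new_threshold <= cond.threshold:
        raise ValueError("new threshold must be larger than the current one")
    return replace(cond, threshold=new_threshold)
```

Starting from a condition such as `AggregateCondition("max", "price", (), 20.0)`, a learner could generate candidates by calling each helper in turn and evaluating `holds` on the related rows of each example. Note that adding a selection condition shrinks the aggregated set, which lowers `max` but can raise `min`, so the effect of a query refinement on generality depends on the aggregate function in use.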
Similar resources
IRSJ: incremental refining spatial joins for interactive queries in GIS
An increasing number of emerging web database applications deal with large georeferenced data sets. However, exploring these large data sets through spatial queries can be very time and resource intensive. The need for interactive spatial queries has arisen in many applications such as Geographic Information Systems (GIS) for efficient decision-support. In this paper, we propose a new interacti...
Using neural networks for relational learning
Relational learners need to be able to handle the information contained in a set of related tuples. Most current relational learners are biased either towards the use of aggregate functions that summarize that set, or towards checking the existence of specific kinds of elements in that set. Learning patterns that contain a combination of both is a challenging task. In this paper we introduce a ...
A Random Forest Approach to Relational Learning
Random forest induction is an ensemble method that uses a random subset of features to build each node in a decision tree. The method has been shown to work well when many features are available. This certainly is the case in relational learning, especially when aggregate functions, combined with selection conditions on the set to be aggregated, are included in the feature space. This paper pre...
A Toolbox for Learning from Relational Data with Propositional and Multi-instance Learners
• uses SQL aggregate functions like SUM, MIN, MAX, AVG and computed standard deviation, quartile and range to capture relational information
• for each value of a nominal column a new attribute is introduced, containing the number of occurrences
• pairs of attributes (one is nominal) are used as GROUP BY conditions for additional aggregations
• determines relations between tables based on name ...
Refining Hygienic Macros for Modules and Separate Compilation
Genuine differences in the treatment of identifiers in block-structured languages and those that provide qualified names for accessing components of modules or aggregate data structures invalidate some of the assumptions hygienic macro systems are based on. We will investigate how these assumptions have to be changed, and the consequences for the construction of hygienic macro expanders. Macro ...